Pass in sourceId tag in all cases #10464

annzhang-db · 2023-11-20T21:22:02Z

🛠 DevTools 🛠

Install mlflow from this PR

pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10464/merge

Checkout with GitHub CLI

gh pr checkout 10464

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

How is this PR tested?

Existing unit/integration tests
New unit/integration tests
Manual tests

Does this PR require documentation update?

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

github-actions · 2023-11-20T21:22:24Z

Documentation preview for 61b1e50 will be available here when this CircleCI job completes successfully.

More info

Ignore this comment if this PR does not change the documentation.
It takes a few minutes for the preview to be available.
The preview is updated when a new commit is pushed to this PR.
This comment was created by https://github.com/mlflow/mlflow/actions/runs/7156901303.

mlflow/tracking/default_experiment/databricks_notebook_experiment_provider.py

dbczumar · 2023-11-27T23:28:56Z

        try:
            experiment_id = MlflowClient().create_experiment(source_notebook_name, None, tags)
        except MlflowException as e:
            if e.error_code == databricks_pb2.ErrorCode.Name(
                databricks_pb2.INVALID_PARAMETER_VALUE
            ):
-                # If repo notebook experiment creation isn't enabled, fall back to
-                # using the notebook ID
+                # If determined that it is not a repo notebook
                experiment_id = source_notebook_id
            else:
                raise e


@annzhang-db I thought we would get a RESOURCE_ALREADY_EXISTS exception if we call create_experiment() with a non-repo notebook path that already exists. Is that what the backend does?

If the backend does indeed return RESOURCE_ALREADY_EXISTS, I think that would break the current implementation of DatabricksNotebookExperimentProvider in this PR; have we tested this thoroughly?

annzhang-db · Nov 29, 2023

Yes, we get a RESOURCE_ALREADY_EXISTS exception if we call create_experiment() with a non-repo notebook path that already exists AND no sourceType/sourceId tags are passed in. Since we are passing in the sourceId here, the request instead falls into the preceding case in the backend (I'll leave a comment on the backend PR) and raises an INVALID_PARAMETER_VALUE error, because sourceId is set but sourceType is not.

dbczumar · Nov 30, 2023

So if I log a param in a non-repo notebook, then attach/detach from the cluster, then try to log a param again, won't this default experiment provider try to call create_experiment() under the hood and then fail at the user level with RESOURCE_ALREADY_EXISTS, which will break the user's workflow?

annzhang-db · Nov 30, 2023

It will call create_experiment() with a sourceId tag, which will not fail with RESOURCE_ALREADY_EXISTS. It fails with RESOURCE_ALREADY_EXISTS only when create_experiment() is called without a sourceId tag.

dbczumar · Nov 30, 2023

Ah, got it!
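
For readers following the thread, the behaviour discussed above can be sketched as a small self-contained simulation. This is only an illustration under stated assumptions: `fake_create_experiment`, `BackendError`, `resolve_experiment_id`, the plain `sourceId`/`sourceType` tag keys, and the `REPO_NOTEBOOK` value are hypothetical stand-ins, not the real MLflow client or Databricks backend. The sketch models only the two cases described in the comments.

```python
# Hypothetical simulation of the backend behaviour described in the review thread.
# None of these names are MLflow APIs; they exist only for illustration.

INVALID_PARAMETER_VALUE = "INVALID_PARAMETER_VALUE"
RESOURCE_ALREADY_EXISTS = "RESOURCE_ALREADY_EXISTS"


class BackendError(Exception):
    """Stand-in for MlflowException, carrying only an error code."""

    def __init__(self, error_code):
        super().__init__(error_code)
        self.error_code = error_code


def fake_create_experiment(name, tags, existing_names):
    """Assumed backend rules, taken from the discussion:

    1. sourceId without sourceType -> INVALID_PARAMETER_VALUE (checked first).
    2. An existing path with no source tags -> RESOURCE_ALREADY_EXISTS.
    Anything else is simplified here to "create the experiment".
    """
    if "sourceId" in tags and "sourceType" not in tags:
        raise BackendError(INVALID_PARAMETER_VALUE)
    if name in existing_names:
        raise BackendError(RESOURCE_ALREADY_EXISTS)
    existing_names.add(name)
    return f"exp-for-{name}"


def resolve_experiment_id(notebook_name, notebook_id, tags, existing_names):
    """Mirror of the fallback in the diff: use the notebook ID on INVALID_PARAMETER_VALUE."""
    try:
        return fake_create_experiment(notebook_name, tags, existing_names)
    except BackendError as e:
        if e.error_code == INVALID_PARAMETER_VALUE:
            # Not a repo notebook: fall back to the notebook ID.
            return notebook_id
        raise


if __name__ == "__main__":
    existing = {"/Users/a/nb"}
    # Non-repo notebook, sourceId always passed: falls back instead of failing.
    print(resolve_experiment_id("/Users/a/nb", "123", {"sourceId": "123"}, existing))
```

Because the sourceId check comes first in this model, re-running the provider after a cluster detach/attach never reaches the RESOURCE_ALREADY_EXISTS branch, which is the point made in the replies above.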

dbczumar

LGTM! Thanks @annzhang-db!

dbczumar

LGTM once we add an integration suite test case: https://src.dev.databricks.com/databricks/universe/-/blob/mlflow/src/test/e2e/tracking/DefaultRepoNotebookExperimentIntegrationSuite.scala. Thanks @annzhang-db

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

github-actions bot added the rn/none label (List under Small Changes in Changelogs) Nov 20, 2023

annzhang-db changed the title from "Pass tags for repos" to "Pass in sourceId tag" Nov 20, 2023

annzhang-db changed the title from "Pass in sourceId tag" to "Pass in sourceId tag in all cases" Nov 20, 2023

annzhang-db force-pushed the repos branch from 4f49f51 to cfcfaf2 November 21, 2023 21:28

dbczumar reviewed Nov 21, 2023

mlflow/tracking/default_experiment/databricks_notebook_experiment_provider.py (outdated, resolved)

annzhang-db requested a review from dbczumar November 21, 2023 23:52

dbczumar reviewed Nov 27, 2023

dbczumar approved these changes Nov 30, 2023

annzhang-db added 12 commits December 10, 2023 00:38

5825d4c · inital · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
8bfb9c8 · fix test · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
1208c10 · logs · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
9f7ea48 · fix logging · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
231e4d2 · try · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
f981dd3 · update · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
e211234 · update · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
f670eca · fix · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
eda95d5 · remove DatabricksRepoNotebookExperimentProvider · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
da482cd · format · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
8330b12 · fix tests · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
9bbd75f · update · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

annzhang-db force-pushed the repos branch from 99200f1 to 9bbd75f December 10, 2023 08:38

annzhang-db added 2 commits December 10, 2023 00:53

2ad0d17 · format · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>
61b1e50 · update · Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

annzhang-db enabled auto-merge (squash) December 10, 2023 09:24

annzhang-db merged commit ef62077 into mlflow:master Dec 10, 2023
36 checks passed

harupy mentioned this pull request Dec 13, 2023

Run python3 dev/update_mlflow_versions.py pre-release ... #10679

Merged

44 tasks

dbczumar Nov 27, 2023

annzhang-db Nov 29, 2023

dbczumar Nov 30, 2023

annzhang-db Nov 30, 2023

dbczumar Nov 30, 2023
